Throughout this document, hover over the numbered annotations to the right of code chunks to reveal detailed explanations and comments about the code. Where italicized drop-down text is present, click the arrow to expand it and view the code.
Data Importation
Data Sources
Procedure
Step 1: Efficiently install packages and load libraries
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}
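The check above only installs pacman; it does not load any libraries yet. A minimal sketch of the loading half of this step might look like the following. The package list is an assumption limited to the two packages this document calls by name (fs and purrr); the project's full list is not shown here.

```r
# Install pacman only if it is missing (same check as in this step)
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}

# p_load() installs any packages that are missing, then attaches them all.
# Assumed package list: only fs and purrr are named in this document.
pacman::p_load(fs, purrr)
```

Using `pacman::p_load()` in place of separate `install.packages()` and `library()` calls keeps the setup chunk short and safe to re-run.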
Step 2: Create a vector of file paths for the raw data files
create_vector_file_paths <- function(directory_path) {
  # List all files in the given directory path
  files_to_import <- fs::dir_ls(path = directory_path)

  # Loop through the files and print each with an index
  for (i in seq_along(files_to_import)) {
    cat(i, "= ", files_to_import[i], "\n")
  }

  # Return the vector of file paths
  return(files_to_import)
}

files_to_import <- create_vector_file_paths("data/raw")
The @iteratively-import-raw-data code chunk takes a long time to execute, so it should only be run once each time the raw data is updated. In all other cases, run the @efficiently-load-raw-data code chunk instead to quickly import the up-to-date raw data.
Step 3: Use the purrr::map() function to iteratively import files in the files_to_import vector except for the profiles data and .RData files
Refer to the output of the files_to_import data object to ensure you are inputting the correct index value corresponding to the file path that needs to be loaded.
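As a rough illustration of Step 3, the sketch below filters a vector of file paths and imports the remaining files with `purrr::map()`. It is not the project's actual import chunk: the temporary folder, the toy files, the assumption that the raw files are CSVs, and the assumption that the profiles file has "profiles" in its name are all hypothetical.

```r
library(purrr)

# Hypothetical stand-in for data/raw: a temporary folder with toy files
raw_dir <- file.path(tempdir(), "raw_example")
dir.create(raw_dir, showWarnings = FALSE)
write.csv(data.frame(site = c("a", "b"), temp_c = c(24.1, 25.3)),
          file.path(raw_dir, "water_quality.csv"), row.names = FALSE)
write.csv(data.frame(depth_m = c(0.5, 1.0)),
          file.path(raw_dir, "profiles.csv"), row.names = FALSE)
placeholder <- 1
save(placeholder, file = file.path(raw_dir, "raw_data.RData"))

files_to_import <- list.files(raw_dir, full.names = TRUE)

# Drop .RData files and any file with "profiles" in its name (assumed naming)
keep <- !grepl("\\.RData$", files_to_import) &
  !grepl("profiles", basename(files_to_import))
csv_paths <- files_to_import[keep]

# Name each path after its file so the imported list is self-describing
names(csv_paths) <- tools::file_path_sans_ext(basename(csv_paths))

# map() returns one data frame per remaining file and keeps the names
raw_data <- purrr::map(csv_paths, function(path) utils::read.csv(path))
```

Filtering by pattern rather than by hard-coded index is one way to avoid picking the wrong file when the contents of the folder change.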
Step 4: Efficiently import up-to-date raw data
base::load(files_to_import[10])
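The reason this step is fast is that `load()` restores objects that were previously written with `save()`, so the slow import does not need to be repeated. A minimal sketch of that round trip follows; the toy object and the file path are hypothetical, since the project's actual .RData file is only referenced by index here.

```r
# Toy object standing in for the list of imported raw data
raw_data <- list(
  water_quality = data.frame(site = c("a", "b"), temp_c = c(24.1, 25.3))
)

# Hypothetical file location for illustration only
rdata_path <- file.path(tempdir(), "raw_data_example.RData")

# Run once, after the slow import: write the object to disk
save(raw_data, file = rdata_path)

# In later sessions: restore the saved objects.
# load() returns the names of the restored objects (invisibly).
restored_env <- new.env()
restored_names <- load(rdata_path, envir = restored_env)
```

Loading into a separate environment, as in this sketch, makes it easy to check what was restored before it overwrites anything in the workspace.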
We will always use snake_case when naming our data objects and functions (e.g., data_object_name or function_name()).
Iterate the export_to_csv(df, df_name, dir_path) function over each dataframe. .x refers to the dataframe and .y refers to the name of the dataframe. Both are passed to the export_to_csv() function along with the desired directory path.
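The iteration described above can be sketched with `purrr::iwalk()`, which passes each list element as `.x` and its name as `.y`. The body of `export_to_csv()` is not shown in this document, so the version below is a hypothetical implementation, and the toy data frames and output folder are placeholders.

```r
library(purrr)

# Hypothetical implementation; the original function body is not shown
export_to_csv <- function(df, df_name, dir_path) {
  dir.create(dir_path, recursive = TRUE, showWarnings = FALSE)
  out_path <- file.path(dir_path, paste0(df_name, ".csv"))
  utils::write.csv(df, out_path, row.names = FALSE)
  invisible(out_path)
}

# Toy named list of dataframes (placeholder data)
data_list <- list(
  water_quality = data.frame(site = c("a", "b"), temp_c = c(24.1, 25.3)),
  weather = data.frame(day = 1:2, rain_mm = c(0, 3.2))
)

# Placeholder output folder standing in for the desired directory path
out_dir <- file.path(tempdir(), "outputs_example")

# .x is the dataframe, .y is its name in data_list
purrr::iwalk(data_list, ~ export_to_csv(.x, .y, out_dir))
```

`iwalk()` is used here rather than `imap()` because the export is run for its side effect (writing files), not for a return value.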
Export the merged final data set to the data/outputs folder